Monotone Statistical Translation Using Word Groups
نویسندگان
چکیده
A new system for statistical natural language translation for languages with similar grammar is introduced. Specifically, it can be used with Romanic Languages, such as French, Spanish or Catalan. The statistical translation uses two sources of information: a language model and a translation model. The language model used is a standard trigram model. A new approach is defined in the translation model. The two main properties of the translation model are: the translation probabilities are computed between groups of words and the alignment between those groups is monotone. That is, the order between the word groups in the source sentence is conserved in the target sentence. Once, the translation model has been defined, we present an algorithm to infer its parameters from training samples. The translation process is carried out with an efficient algorithm based on stack-decoding. Finally, we present some translation results from Catalan to Spanish and compare our model with other conventional models.
منابع مشابه
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
متن کاملImprovements in Phrase-Based Statistical Machine Translation
In statistical machine translation, the currently best performing systems are based in some way on phrases or word groups. We describe the baseline phrase-based translation system and various refinements. We describe a highly efficient monotone search algorithm with a complexity linear in the input sentence length. We present translation results for three tasks: Verbmobil, Xerox and the Canadia...
متن کاملReordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation
This paper presents a reordering model using syntactic information of a source tree for phrase-based statistical machine translation. The proposed model is an extension of ISTITG (imposing source tree on inversion transduction grammar) constraints. In the proposed method, the target-side word order is obtained by rotating nodes of the source-side parse-tree. We modeled the node rotation, monoto...
متن کاملPhrase-Based Statistical Machine Translation
This paper is based on the work carried out in the framework of the Verbmobil project, which is a limited-domain speech translation task (German-English). In the final evaluation, the statistical approach was found to perform best among five competing approaches. In this paper, we will further investigate the used statistical translation models. A shortcoming of the single-word based model is t...
متن کاملAccelerated DP based search for statistical translation
In this paper, we describe a fast search algorithm for statistical translation based on dynamic programming (DP) and present experimental results. The approach is based on the assumption that the word alignment is monotone with respect to the word order in both languages. To reduce the search eeort for this approach, we introduce two methods: an acceleration technique to eeciently compute the d...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001